Overview

Dataset statistics

Number of variables: 8
Number of observations: 32065
Missing cells: 22385
Missing cells (%): 8.7%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 6.5 MiB
Average record size in memory: 211.5 B
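The percentage figures above follow directly from the counts. As a minimal sketch in plain Python (values copied from this table; the variable names are illustrative, not part of the report), the missing-cell share is the missing count divided by variables times observations:

```python
# Figures copied from the Dataset statistics table.
n_variables = 8
n_observations = 32065
missing_cells = 22385

# Every observation has one cell per variable.
total_cells = n_variables * n_observations

# Share of all cells that are missing.
missing_pct = 100 * missing_cells / total_cells

# Province/State is the only variable reporting missing values, so its own
# missing rate is measured against the number of rows rather than cells.
province_missing_pct = 100 * missing_cells / n_observations

print(f"Total cells: {total_cells}")
print(f"Missing cells: {missing_pct:.1f}%")
print(f"Province/State missing: {province_missing_pct:.1f}%")
```

The two rates differ only in the denominator: all cells for the dataset-level figure, rows for the per-variable figure.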

Variable types

Categorical: 3
Numeric: 5

Alerts

Province/State has a high cardinality: 80 distinct values [High cardinality]
Country/Region has a high cardinality: 188 distinct values [High cardinality]
Date has a high cardinality: 121 distinct values [High cardinality]
Confirmed is highly correlated with Deaths and 1 other field [High correlation]
Deaths is highly correlated with Confirmed and 1 other field [High correlation]
Recovered is highly correlated with Confirmed and 1 other field [High correlation]
Province/State is highly correlated with Lat and 2 other fields [High correlation]
Lat is highly correlated with Province/State and 1 other field [High correlation]
Long is highly correlated with Province/State and 1 other field [High correlation]
Recovered is highly correlated with Province/State and 2 other fields [High correlation]
Province/State has 22385 (69.8%) missing values [Missing]
Confirmed is highly skewed (γ1 = 22.83271239) [Skewed]
Province/State is uniformly distributed [Uniform]
Date is uniformly distributed [Uniform]
Lat has 363 (1.1%) zeros [Zeros]
Long has 363 (1.1%) zeros [Zeros]
Confirmed has 10302 (32.1%) zeros [Zeros]
Deaths has 17916 (55.9%) zeros [Zeros]
Recovered has 16088 (50.2%) zeros [Zeros]

Reproduction

Analysis started: 2022-08-01 20:24:18.544844
Analysis finished: 2022-08-01 20:26:22.282847
Duration: 2 minutes and 3.74 seconds
Software version: pandas-profiling v3.2.0
Download configuration: config.json

Variables

Province/State
Categorical

HIGH CARDINALITY
HIGH CORRELATION
MISSING
UNIFORM

Distinct: 80
Distinct (%): 0.8%
Missing: 22385
Missing (%): 69.8%
Memory size: 1.3 MiB

Value | Count
Ningxia | 121
New Caledonia | 121
Mayotte | 121
Guadeloupe | 121
French Polynesia | 121
Other values (75) | 9075

Length

Max length: 28
Median length: 24
Mean length: 10.6375
Min length: 5

Characters and Unicode

Total characters: 102971
Distinct characters: 54
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Australian Capital Territory
2nd row: New South Wales
3rd row: Northern Territory
4th row: Queensland
5th row: South Australia

Common Values

Value | Count | Frequency (%)
Ningxia | 121 | 0.4%
New Caledonia | 121 | 0.4%
Mayotte | 121 | 0.4%
Guadeloupe | 121 | 0.4%
French Polynesia | 121 | 0.4%
French Guiana | 121 | 0.4%
Greenland | 121 | 0.4%
Faroe Islands | 121 | 0.4%
Zhejiang | 121 | 0.4%
Yunnan | 121 | 0.4%
Other values (70) | 8470 | 26.4%
(Missing) | 22385 | 69.8%

Length

[Figure: Histogram of lengths of the category]

Words

Value | Count | Frequency (%)
islands | 726 | 5.0%
new | 363 | 2.5%
and | 363 | 2.5%
territory | 242 | 1.7%
british | 242 | 1.7%
french | 242 | 1.7%
south | 242 | 1.7%
saint | 242 | 1.7%
australia | 242 | 1.7%
princess | 242 | 1.7%
Other values (94) | 11374 | 78.3%

Most occurring characters

Value | Count | Frequency (%)
a | 12221 | 11.9%
n | 11616 | 11.3%
i | 9196 | 8.9%
e | 6413 | 6.2%
r | 5929 | 5.8%
(space) | 4840 | 4.7%
o | 4598 | 4.5%
s | 4598 | 4.5%
u | 4235 | 4.1%
t | 4235 | 4.1%
Other values (44) | 35090 | 34.1%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 83853 | 81.4%
Uppercase Letter | 14036 | 13.6%
Space Separator | 4840 | 4.7%
Close Punctuation | 121 | 0.1%
Open Punctuation | 121 | 0.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
a | 12221 | 14.6%
n | 11616 | 13.9%
i | 9196 | 11.0%
e | 6413 | 7.6%
r | 5929 | 7.1%
o | 4598 | 5.5%
s | 4598 | 5.5%
u | 4235 | 5.1%
t | 4235 | 5.1%
l | 3993 | 4.8%
Other values (16) | 16819 | 20.1%

Uppercase Letter
Value | Count | Frequency (%)
S | 1573 | 11.2%
M | 1331 | 9.5%
I | 1089 | 7.8%
G | 1089 | 7.8%
N | 968 | 6.9%
C | 968 | 6.9%
T | 847 | 6.0%
H | 847 | 6.0%
A | 847 | 6.0%
B | 726 | 5.2%
Other values (15) | 3751 | 26.7%

Space Separator
Value | Count | Frequency (%)
(space) | 4840 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 121 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 121 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 97889 | 95.1%
Common | 5082 | 4.9%

Most frequent character per script

Latin
Value | Count | Frequency (%)
a | 12221 | 12.5%
n | 11616 | 11.9%
i | 9196 | 9.4%
e | 6413 | 6.6%
r | 5929 | 6.1%
o | 4598 | 4.7%
s | 4598 | 4.7%
u | 4235 | 4.3%
t | 4235 | 4.3%
l | 3993 | 4.1%
Other values (41) | 30855 | 31.5%

Common
Value | Count | Frequency (%)
(space) | 4840 | 95.2%
) | 121 | 2.4%
( | 121 | 2.4%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 102971 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
a | 12221 | 11.9%
n | 11616 | 11.3%
i | 9196 | 8.9%
e | 6413 | 6.2%
r | 5929 | 5.8%
(space) | 4840 | 4.7%
o | 4598 | 4.5%
s | 4598 | 4.5%
u | 4235 | 4.1%
t | 4235 | 4.1%
Other values (44) | 35090 | 34.1%

Country/Region
Categorical

HIGH CARDINALITY

Distinct: 188
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value | Count
China | 3993
Canada | 1694
United Kingdom | 1331
France | 1331
Australia | 968
Other values (183) | 22748

Length

Max length: 32
Median length: 22
Mean length: 8.083018868
Min length: 2

Characters and Unicode

Total characters: 259182
Distinct characters: 57
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Afghanistan
2nd row: Albania
3rd row: Algeria
4th row: Andorra
5th row: Angola

Common Values

Value | Count | Frequency (%)
China | 3993 | 12.5%
Canada | 1694 | 5.3%
United Kingdom | 1331 | 4.2%
France | 1331 | 4.2%
Australia | 968 | 3.0%
Netherlands | 484 | 1.5%
Denmark | 363 | 1.1%
Afghanistan | 121 | 0.4%
Russia | 121 | 0.4%
Rwanda | 121 | 0.4%
Other values (178) | 21538 | 67.2%

Length

[Figure: Histogram of lengths of the category]

Words

Value | Count | Frequency (%)
china | 3993 | 10.2%
canada | 1694 | 4.3%
united | 1452 | 3.7%
kingdom | 1331 | 3.4%
france | 1331 | 3.4%
australia | 968 | 2.5%
and | 847 | 2.2%
netherlands | 484 | 1.2%
denmark | 363 | 0.9%
guinea | 363 | 0.9%
Other values (210) | 26378 | 67.3%

Most occurring characters

Value | Count | Frequency (%)
a | 42350 | 16.3%
n | 25289 | 9.8%
i | 24321 | 9.4%
e | 16698 | 6.4%
r | 13310 | 5.1%
o | 11495 | 4.4%
d | 10406 | 4.0%
t | 9438 | 3.6%
u | 7623 | 2.9%
C | 7623 | 2.9%
Other values (47) | 90629 | 35.0%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 212355 | 81.9%
Uppercase Letter | 38720 | 14.9%
Space Separator | 7139 | 2.8%
Dash Punctuation | 242 | 0.1%
Open Punctuation | 242 | 0.1%
Close Punctuation | 242 | 0.1%
Other Punctuation | 242 | 0.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
a | 42350 | 19.9%
n | 25289 | 11.9%
i | 24321 | 11.5%
e | 16698 | 7.9%
r | 13310 | 6.3%
o | 11495 | 5.4%
d | 10406 | 4.9%
t | 9438 | 4.4%
u | 7623 | 3.6%
h | 7381 | 3.5%
Other values (16) | 44044 | 20.7%

Uppercase Letter
Value | Count | Frequency (%)
C | 7623 | 19.7%
S | 3630 | 9.4%
B | 2662 | 6.9%
A | 2662 | 6.9%
K | 2299 | 5.9%
M | 2178 | 5.6%
U | 2057 | 5.3%
G | 1815 | 4.7%
N | 1694 | 4.4%
F | 1694 | 4.4%
Other values (15) | 10406 | 26.9%

Other Punctuation
Value | Count | Frequency (%)
* | 121 | 50.0%
' | 121 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 7139 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 242 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 242 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 242 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 251075 | 96.9%
Common | 8107 | 3.1%

Most frequent character per script

Latin
Value | Count | Frequency (%)
a | 42350 | 16.9%
n | 25289 | 10.1%
i | 24321 | 9.7%
e | 16698 | 6.7%
r | 13310 | 5.3%
o | 11495 | 4.6%
d | 10406 | 4.1%
t | 9438 | 3.8%
u | 7623 | 3.0%
C | 7623 | 3.0%
Other values (41) | 82522 | 32.9%

Common
Value | Count | Frequency (%)
(space) | 7139 | 88.1%
- | 242 | 3.0%
( | 242 | 3.0%
) | 242 | 3.0%
* | 121 | 1.5%
' | 121 | 1.5%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 259182 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
a | 42350 | 16.3%
n | 25289 | 9.8%
i | 24321 | 9.4%
e | 16698 | 6.4%
r | 13310 | 5.1%
o | 11495 | 4.4%
d | 10406 | 4.0%
t | 9438 | 3.6%
u | 7623 | 2.9%
C | 7623 | 2.9%
Other values (47) | 90629 | 35.0%

Lat
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct: 256
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 21.18189118
Minimum: -51.7963
Maximum: 71.7069
Zeros: 363
Zeros (%): 1.1%
Negative: 5687
Negative (%): 17.7%
Memory size: 250.6 KiB
[Figure: Mini histogram]

Quantile statistics

Minimum: -51.7963
5-th percentile: -28.0167
Q1: 6.877
Median: 23.6345
Q3: 41.1533
95-th percentile: 55.1694
Maximum: 71.7069
Range: 123.5032
Interquartile range (IQR): 34.2763
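
Range and IQR are derived from the quantiles listed above. A small check in plain Python, using the Lat figures as given (the variable names are illustrative):

```python
# Quantile figures copied from the Lat table.
lat_min = -51.7963
lat_q1 = 6.877
lat_q3 = 41.1533
lat_max = 71.7069

# Range spans the full data; IQR spans the middle 50%.
lat_range = lat_max - lat_min
lat_iqr = lat_q3 - lat_q1

print(f"Range: {lat_range:.4f}")
print(f"IQR:   {lat_iqr:.4f}")
```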

Descriptive statistics

Standard deviation: 24.90426049
Coefficient of variation (CV): 1.175733568
Kurtosis: -0.1946022785
Mean: 21.18189118
Median Absolute Deviation (MAD): 17.2064
Skewness: -0.545288114
Sum: 679197.3407
Variance: 620.2221908
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Value | Count | Frequency (%)
0 | 363 | 1.1%
33 | 242 | 0.8%
13.1939 | 242 | 0.8%
36 | 242 | 0.8%
-4.0383 | 242 | 0.8%
24 | 242 | 0.8%
52.9399 | 242 | 0.8%
21 | 242 | 0.8%
46.8852 | 121 | 0.4%
28.1667 | 121 | 0.4%
Other values (246) | 29766 | 92.8%

Value | Count | Frequency (%)
-51.7963 | 121 | 0.4%
-41.4545 | 121 | 0.4%
-40.9006 | 121 | 0.4%
-38.4161 | 121 | 0.4%
-37.8136 | 121 | 0.4%
-35.6751 | 121 | 0.4%
-35.4735 | 121 | 0.4%
-34.9285 | 121 | 0.4%
-33.8688 | 121 | 0.4%
-32.5228 | 121 | 0.4%

Value | Count | Frequency (%)
71.7069 | 121 | 0.4%
64.9631 | 121 | 0.4%
64.8255 | 121 | 0.4%
64.2823 | 121 | 0.4%
64 | 121 | 0.4%
63 | 121 | 0.4%
61.8926 | 121 | 0.4%
60.472 | 121 | 0.4%
60 | 121 | 0.4%
58.5953 | 121 | 0.4%

Long
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct: 259
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 22.88119533
Minimum: -135
Maximum: 178.065
Zeros: 363
Zeros (%): 1.1%
Negative: 10406
Negative (%): 32.5%
Memory size: 250.6 KiB
[Figure: Mini histogram]

Quantile statistics

Minimum: -135
5-th percentile: -85.2072
Q1: -15.3101
Median: 21.0059
Q3: 78
95-th percentile: 128
Maximum: 178.065
Range: 313.065
Interquartile range (IQR): 93.3101

Descriptive statistics

Standard deviation: 70.24552349
Coefficient of variation (CV): 3.070011093
Kurtosis: -0.7784427299
Mean: 22.88119533
Median Absolute Deviation (MAD): 52.2148
Skewness: -0.007872807144
Sum: 733685.5282
Variance: 4934.43357
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Value | Count | Frequency (%)
0 | 363 | 1.1%
21.7587 | 242 | 0.8%
30 | 242 | 0.8%
-59.5432 | 242 | 0.8%
9 | 242 | 0.8%
8.6753 | 121 | 0.4%
57 | 121 | 0.4%
8.4689 | 121 | 0.4%
21.7453 | 121 | 0.4%
8.0817 | 121 | 0.4%
Other values (249) | 30129 | 94.0%

Value | Count | Frequency (%)
-135 | 121 | 0.4%
-124.8457 | 121 | 0.4%
-123.1207 | 121 | 0.4%
-122.6655 | 121 | 0.4%
-116.5765 | 121 | 0.4%
-106.4509 | 121 | 0.4%
-102.5528 | 121 | 0.4%
-98.8139 | 121 | 0.4%
-95.7129 | 121 | 0.4%
-90.2308 | 121 | 0.4%

Value | Count | Frequency (%)
178.065 | 121 | 0.4%
174.886 | 121 | 0.4%
165.618 | 121 | 0.4%
153.4 | 121 | 0.4%
151.2093 | 121 | 0.4%
149.4068 | 121 | 0.4%
149.0124 | 121 | 0.4%
145.9707 | 121 | 0.4%
144.9631 | 121 | 0.4%
143.9555 | 121 | 0.4%

Date
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 121
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 MiB

Value | Count
1/22/20 | 265
3/23/20 | 265
4/20/20 | 265
4/19/20 | 265
4/18/20 | 265
Other values (116) | 30740

Length

Max length: 7
Median length: 7
Mean length: 6.702479339
Min length: 6

Characters and Unicode

Total characters: 214915
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1/22/20
2nd row: 1/22/20
3rd row: 1/22/20
4th row: 1/22/20
5th row: 1/22/20
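
Date is profiled as Categorical because its values are M/D/YY strings. A hedged sketch of parsing them with the Python standard library follows. The helper name `parse_report_date` is hypothetical, and the assumption that the series runs daily from 1/22/20 (first rows) to 5/21/20 (last rows) is inferred from the samples in this report rather than stated by it:

```python
from datetime import date, datetime


def parse_report_date(text: str) -> date:
    """Parse an M/D/YY string such as '1/22/20' into a date (hypothetical helper)."""
    return datetime.strptime(text, "%m/%d/%y").date()


# Endpoints taken from the first and last sample rows of the report.
first = parse_report_date("1/22/20")
last = parse_report_date("5/21/20")

# Number of calendar days covered, counting both endpoints.
n_days = (last - first).days + 1

print(first.isoformat(), last.isoformat(), n_days)
```

Under that assumption the span covers 121 days, which matches the 121 distinct values reported for Date.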

Common Values

Value | Count | Frequency (%)
1/22/20 | 265 | 0.8%
3/23/20 | 265 | 0.8%
4/20/20 | 265 | 0.8%
4/19/20 | 265 | 0.8%
4/18/20 | 265 | 0.8%
4/17/20 | 265 | 0.8%
4/16/20 | 265 | 0.8%
4/15/20 | 265 | 0.8%
4/14/20 | 265 | 0.8%
4/13/20 | 265 | 0.8%
Other values (111) | 29415 | 91.7%

Length

[Figure: Histogram of lengths of the category]

Words

Value | Count | Frequency (%)
1/22/20 | 265 | 0.8%
1/23/20 | 265 | 0.8%
1/24/20 | 265 | 0.8%
1/25/20 | 265 | 0.8%
1/26/20 | 265 | 0.8%
1/27/20 | 265 | 0.8%
1/28/20 | 265 | 0.8%
1/29/20 | 265 | 0.8%
1/30/20 | 265 | 0.8%
1/31/20 | 265 | 0.8%
Other values (111) | 29415 | 91.7%

Most occurring characters

Value | Count | Frequency (%)
/ | 64130 | 29.8%
2 | 53530 | 24.9%
0 | 34980 | 16.3%
1 | 16960 | 7.9%
3 | 12720 | 5.9%
4 | 11130 | 5.2%
5 | 8745 | 4.1%
9 | 3180 | 1.5%
8 | 3180 | 1.5%
7 | 3180 | 1.5%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 150785 | 70.2%
Other Punctuation | 64130 | 29.8%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
2 | 53530 | 35.5%
0 | 34980 | 23.2%
1 | 16960 | 11.2%
3 | 12720 | 8.4%
4 | 11130 | 7.4%
5 | 8745 | 5.8%
9 | 3180 | 2.1%
8 | 3180 | 2.1%
7 | 3180 | 2.1%
6 | 3180 | 2.1%

Other Punctuation
Value | Count | Frequency (%)
/ | 64130 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 214915 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
/ | 64130 | 29.8%
2 | 53530 | 24.9%
0 | 34980 | 16.3%
1 | 16960 | 7.9%
3 | 12720 | 5.9%
4 | 11130 | 5.2%
5 | 8745 | 4.1%
9 | 3180 | 1.5%
8 | 3180 | 1.5%
7 | 3180 | 1.5%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 214915 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
/ | 64130 | 29.8%
2 | 53530 | 24.9%
0 | 34980 | 16.3%
1 | 16960 | 7.9%
3 | 12720 | 5.9%
4 | 11130 | 5.2%
5 | 8745 | 4.1%
9 | 3180 | 1.5%
8 | 3180 | 1.5%
7 | 3180 | 1.5%

Confirmed
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED
ZEROS

Distinct: 5034
Distinct (%): 15.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5044.945579
Minimum: 0
Maximum: 1577147
Zeros: 10302
Zeros (%): 32.1%
Negative: 0
Negative (%): 0.0%
Memory size: 250.6 KiB
[Figure: Mini histogram]

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 21
Q3: 460
95-th percentile: 12288
Maximum: 1577147
Range: 1577147
Interquartile range (IQR): 460

Descriptive statistics

Standard deviation: 44878.33535
Coefficient of variation (CV): 8.89570257
Kurtosis: 637.011566
Mean: 5044.945579
Median Absolute Deviation (MAD): 21
Skewness: 22.83271239
Sum: 161766180
Variance: 2014064984
Monotonicity: Not monotonic
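
The coefficient of variation and the variance above are both functions of the standard deviation and the mean. A quick sketch with the Confirmed figures (values copied from this table, so small rounding differences against the reported numbers are expected; the variable names are illustrative):

```python
# Figures copied from the Confirmed descriptive statistics.
std_dev = 44878.33535
mean_value = 5044.945579

# CV is the standard deviation expressed as a multiple of the mean.
cv = std_dev / mean_value

# Variance is the square of the standard deviation.
variance = std_dev ** 2

print(f"CV:       {cv:.6f}")
print(f"Variance: {variance:.0f}")
```

A CV near 9 means the spread is roughly nine times the mean, which is consistent with the heavy right skew flagged for Confirmed.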
[Figure: Histogram with fixed size bins (bins=50)]

Value | Count | Frequency (%)
0 | 10302 | 32.1%
1 | 1116 | 3.5%
3 | 466 | 1.5%
2 | 426 | 1.3%
18 | 335 | 1.0%
11 | 332 | 1.0%
4 | 305 | 1.0%
6 | 299 | 0.9%
5 | 297 | 0.9%
16 | 256 | 0.8%
Other values (5024) | 17931 | 55.9%

Value | Count | Frequency (%)
0 | 10302 | 32.1%
1 | 1116 | 3.5%
2 | 426 | 1.3%
3 | 466 | 1.5%
4 | 305 | 1.0%
5 | 297 | 0.9%
6 | 299 | 0.9%
7 | 228 | 0.7%
8 | 236 | 0.7%
9 | 217 | 0.7%

Value | Count | Frequency (%)
1577147 | 1 | < 0.1%
1551853 | 1 | < 0.1%
1528568 | 1 | < 0.1%
1508308 | 1 | < 0.1%
1486757 | 1 | < 0.1%
1467820 | 1 | < 0.1%
1442824 | 1 | < 0.1%
1417774 | 1 | < 0.1%
1390406 | 1 | < 0.1%
1369376 | 1 | < 0.1%

Deaths
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 1634
Distinct (%): 5.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 335.5694995
Minimum: 0
Maximum: 94702
Zeros: 17916
Zeros (%): 55.9%
Negative: 0
Negative (%): 0.0%
Memory size: 250.6 KiB
[Figure: Mini histogram]

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 6
95-th percentile: 363.8
Maximum: 94702
Range: 94702
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 3095.690148
Coefficient of variation (CV): 9.225183318
Kurtosis: 378.8853571
Mean: 335.5694995
Median Absolute Deviation (MAD): 0
Skewness: 17.17990218
Sum: 10760036
Variance: 9583297.495
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Value | Count | Frequency (%)
0 | 17916 | 55.9%
1 | 2326 | 7.3%
2 | 1286 | 4.0%
3 | 1079 | 3.4%
6 | 779 | 2.4%
4 | 612 | 1.9%
7 | 485 | 1.5%
8 | 415 | 1.3%
5 | 337 | 1.1%
9 | 305 | 1.0%
Other values (1624) | 6525 | 20.3%

Value | Count | Frequency (%)
0 | 17916 | 55.9%
1 | 2326 | 7.3%
2 | 1286 | 4.0%
3 | 1079 | 3.4%
4 | 612 | 1.9%
5 | 337 | 1.1%
6 | 779 | 2.4%
7 | 485 | 1.5%
8 | 415 | 1.3%
9 | 305 | 1.0%

Value | Count | Frequency (%)
94702 | 1 | < 0.1%
93439 | 1 | < 0.1%
91921 | 1 | < 0.1%
90347 | 1 | < 0.1%
89562 | 1 | < 0.1%
88754 | 1 | < 0.1%
87530 | 1 | < 0.1%
85898 | 1 | < 0.1%
84119 | 1 | < 0.1%
82356 | 1 | < 0.1%

Recovered
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 2978
Distinct (%): 9.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1525.388056
Minimum: 0
Maximum: 298418
Zeros: 16088
Zeros (%): 50.2%
Negative: 0
Negative (%): 0.0%
Memory size: 250.6 KiB
[Figure: Mini histogram]

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 92
95-th percentile: 2507.8
Maximum: 298418
Range: 298418
Interquartile range (IQR): 92

Descriptive statistics

Standard deviation: 10978.55868
Coefficient of variation (CV): 7.197223449
Kurtosis: 207.9480819
Mean: 1525.388056
Median Absolute Deviation (MAD): 0
Skewness: 12.64965022
Sum: 48911568
Variance: 120528750.7
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Value | Count | Frequency (%)
0 | 16088 | 50.2%
1 | 1105 | 3.4%
2 | 613 | 1.9%
4 | 365 | 1.1%
3 | 357 | 1.1%
5 | 275 | 0.9%
8 | 269 | 0.8%
6 | 252 | 0.8%
10 | 231 | 0.7%
7 | 217 | 0.7%
Other values (2968) | 12293 | 38.3%

Value | Count | Frequency (%)
0 | 16088 | 50.2%
1 | 1105 | 3.4%
2 | 613 | 1.9%
3 | 357 | 1.1%
4 | 365 | 1.1%
5 | 275 | 0.9%
6 | 252 | 0.8%
7 | 217 | 0.7%
8 | 269 | 0.8%
9 | 153 | 0.5%

Value | Count | Frequency (%)
298418 | 1 | < 0.1%
294312 | 1 | < 0.1%
289392 | 1 | < 0.1%
283178 | 1 | < 0.1%
272265 | 1 | < 0.1%
268376 | 1 | < 0.1%
250747 | 1 | < 0.1%
246414 | 1 | < 0.1%
243430 | 1 | < 0.1%
232733 | 1 | < 0.1%

Interactions

Correlations

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
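
The calculation described above can be sketched in plain Python: rank each variable, then take the covariance of the ranks divided by the product of their standard deviations. This is an illustrative implementation, not the code the report itself runs; the function names and the example data are assumptions made for the sketch.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Covariance over the product of standard deviations (1/n factors cancel)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson's r applied to the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))


x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]  # y = x**3: monotonic but not linear

print("spearman:", spearman(x, y))
print("pearson: ", pearson(x, y))
```

On this example the ranks of x and y coincide, so ρ reaches its maximum while Pearson's r stays below it, which illustrates why ρ is better suited to nonlinear monotonic relationships.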
Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
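
As a minimal sketch of that formula, and of the location and scale invariance mentioned above, the following plain-Python example computes r before and after shifting and rescaling each variable. The function name and the sample data are assumptions made for illustration.

```python
from statistics import mean


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations.

    The 1/n factors in the covariance and the standard deviations cancel,
    so plain sums of products are used.
    """
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


# Made-up, nearly linear data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]

r_original = pearson_r(x, y)

# Shift and rescale each variable separately, using positive scale factors.
r_transformed = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])

print("original:   ", r_original)
print("transformed:", r_transformed)
```

The two results agree up to floating-point error. A negative scale factor on one variable would flip the sign of r without changing its magnitude.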
Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
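
The pair-counting definition above corresponds to the variant usually called tau-a. A short sketch in plain Python follows; the function name and data are illustrative, and tied pairs are counted as neither concordant nor discordant. Library implementations such as `scipy.stats.kendalltau` default to the tie-adjusted tau-b variant, so results can differ when ties are present.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total pairs, as described in the text."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1  # both variables move in the same direction
        elif direction < 0:
            discordant += 1  # the variables move in opposite directions
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


x = [1, 2, 3, 4]
y = [1, 3, 2, 4]  # one swapped pair relative to x

tau = kendall_tau_a(x, y)
print("tau-a:", tau)
```

Here five of the six pairs are concordant and one is discordant, so τ = (5 - 1) / 6.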
Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

Count: A simple visualization of nullity by column.
Matrix: The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion visually.
Dendrogram: The dendrogram lets you correlate variable completion more fully, revealing trends deeper than the pairwise ones visible in the correlation heatmap.

Sample

First rows

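(index) | Province/State | Country/Region | Lat | Long | Date | Confirmed | Deaths | Recovered
0 | NaN | Afghanistan | 33.0000 | 65.0000 | 1/22/20 | 0 | 0 | 0
1 | NaN | Albania | 41.1533 | 20.1683 | 1/22/20 | 0 | 0 | 0
2 | NaN | Algeria | 28.0339 | 1.6596 | 1/22/20 | 0 | 0 | 0
3 | NaN | Andorra | 42.5063 | 1.5218 | 1/22/20 | 0 | 0 | 0
4 | NaN | Angola | -11.2027 | 17.8739 | 1/22/20 | 0 | 0 | 0
5 | NaN | Antigua and Barbuda | 17.0608 | -61.7964 | 1/22/20 | 0 | 0 | 0
6 | NaN | Argentina | -38.4161 | -63.6167 | 1/22/20 | 0 | 0 | 0
7 | NaN | Armenia | 40.0691 | 45.0382 | 1/22/20 | 0 | 0 | 0
8 | Australian Capital Territory | Australia | -35.4735 | 149.0124 | 1/22/20 | 0 | 0 | 0
9 | New South Wales | Australia | -33.8688 | 151.2093 | 1/22/20 | 0 | 0 | 0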

Last rows

Province/StateCountry/RegionLatLongDateConfirmedDeathsRecovered
32055NaNMalawi-13.25430834.3015255/21/2072327
32056Falkland Islands (Malvinas)United Kingdom-51.796300-59.5236005/21/2013013
32057Saint Pierre and MiquelonFrance46.885200-56.3159005/21/20101
32058NaNSouth Sudan6.87700031.3070005/21/2048140
32059NaNWestern Sahara24.215500-12.8858005/21/20606
32060NaNSao Tome and Principe0.1863606.6130815/21/2025184
32061NaNYemen15.55272748.5163885/21/20197330
32062NaNComoros-11.64550043.3333005/21/203418
32063NaNTajikistan38.86103471.2760935/21/202350440
32064NaNLesotho-29.60998828.2336085/21/20100